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Abstract 

We consider the problem of estimating the direction of arrival of a signal embedded 
in A^-distributed noise, when secondary data which contains noise only are assumed to be 
available. Based upon a recent formula of the Fisher information matrix (FIM) for complex 
elliptically distributed data, we provide a simple expression of the FIM with the two data 
sets framework. In the specific case of AT-distributed noise, we show that, under certain 
conditions, the FIM for the deterministic part of the model can be unbounded, while the 
FIM for the covariance part of the model is always bounded. In the general case of elliptical 
distributions, we provide a sufficient condition for unboundedness of the FIM. Accurate 
approximations of the FIM for A'-distributed noise are also derived when it is bounded. 
Additionally, the maximum likelihood estimator of the signal DoA and an approximated 
version are derived, assuming known covariance matrix: the latter is then estimated from 
secondary data using a conventional regularization technique. When the FIM is unbounded, 
an analysis of the estimators reveals a rate of convergence much faster than the usual T -1 . 
Simulations illustrate the different behaviors of the estimators, depending on the FIM being 
bounded or not. 
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1 Problem statement 


Estimating the direction of arrival (DoA) of multiple signals impinging on an array of sensors 
from observation of a finite number of array snapshots has been extensively studied in the 
literature |1 . Maximum likelihood estimators (MLE) and Cramer-Rao bounds (CRB), derived 
under the assumption of additive white Gaussian noise, and either for the so-called conditional 
or unconditional model |2jj5], serve as references to which newly developed DoA estimators have 
been systematically compared. In many instances however, additive noise is usually colored 
and, consequently, the problem of DoA estimation in spatially correlated noise fields has been 
studied, see e.g., [6-10 


When the spatial covariance matrix of this additive noise is known a priori, maximum likeli¬ 
hood estimators and Cramer-Rao bounds are changing in a straightforward way with whitening 
operations. The new statistical problem appears when the covariance matrix of the additive 
noise is not known a priori and information about this matrix is substituted by a number of 
independent and identically distributed (i.i.d.) training samples, that form the so-called sec¬ 
ondary training sample data set. In many cases one can assume that the statistical properties 
of the training noise data are the same as per noise data within the primary training set data: 
such conditions are usually referred to as the supervised training conditions. Therefore, under 
these conditions, one has two sets of measurements, one primary set X E C MxT p which contains 
signals of interest (SOI) and noise, and a second set Y E C MxTs (secondary training set) which 
contains noise only. Examples of this problem formulation are numerous in the area of passive 
location and direction finding. For instance, in the so-called over-sampled 2D HF antenna ar¬ 
rays, ionospherically propagated external noise is spatially non white HI, 12] , and some parts of 


HF spectrum (distress signals for example) with no signals may be used for external noise sam¬ 
pling [13] . Despite its relevance in many practical situations, this problem has been relatively 
scarcely studied [14, 15j. For parametric description of the Gaussian noise covariance matrix 
with 6 n the unknown parameter vector, in [l5|, the authors derive the Cramer-Rao bound for 
joint SOI parameters (DoA) 6 s and noise parameters 6 n estimation, assuming a conventional un¬ 
conditional model, i.e., X ~ CAf (0. A(<fi)T A(<fr) H + R n , It p ) and Y ~ CA/"(0,R n , It s ) where 


CAT (.,.,.) stands for the complex Gaussian distribution whose respective parameters are the 
mean, row covariance matrix and column covariance matrix. A ((f)) is the usual steering ma¬ 
trix with 0 the vector of unknowns DoA, T denotes the waveforms covariance matrix and R n 
corresponds to the noise covariance matrix, which is parameterized by vector 0 n . 

In many cases however, the Gaussian assumption for the predominant part of the noise cannot 
be advocated. Typical example is the HF external noise, heavily dominated by powerful lighting 
strikes 


16-18 


Evidence of deviations from the Gaussian assumption has been demonstrated 
numerous times for different applications, with the relevance of the compound-Gaussian (CG) 
models being justified 19-24 . In essence, the individual M-variate snapshot of such a noise 


over the face of an antenna array may be treated as a Gaussian random vector, whose power can 
randomly fluctuate from sample to sample. CG models belong to a larger class of distributions, 
namely multivariate elliptically contoured distributions (ECD) [25-27 . For the sake of clarity, 


we briefly review the main definitions of ECD. A vector x E C M follows an EC distribution if it 
admits the following stochastic representation 


x = m + y^QAu (1) 

where = means “has the same distribution as”. In Q is a non-negative real random variable 
and is independent of the complex random vector u which is uniformly distributed over the 
complex sphere C S M = ju E C M ; ||u|| = lj. The matrix A is such that AA H = R where R is 
the so-called scatter matrix, and we assume here that R is non-singular. The probability density 
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function (p.d.f.) of x can then be written as 

p(x|m, R, g) oc |R | —1 <7 ((x — m)^R _1 (x — m)) (2) 

where oc stands for proportional to. The function g : M + —> M + is called the density generator 
and satisfies finite moment condition 5m, g = g{t)dt < 00. It is related to the p.d.f. of 

the modular variate Q by p(Q) = 5^ g Q M ~ 1 g(Q). 

Going back to our scenario of two data sets X = [x 4] ... xy p ] and Y = [y tl ... yx s ] , 
we assume that they are independent, and that their columns are independent and distributed 
(i.i.d.) according to ©>• In other words, one has x tp = m tp + ^/Ch~Au tp and y ts = sfQtl Au (s , 
where Qt v and Qt a are i.i.d. variables drawn from p(Q) oc Q M ~ X g{Q), and u tp and u ts 
are i.i.d. random vectors uniformly distributed on the unit sphere. It then follows that 
the joint distribution of X, Y is given by p(X,Y|M,R, g) = p(X|M, R, g)p(Y|R, g) where 
M = [m(l) ... m(T p )] T and 

T v 

p(X|M, R, g) oc |R| -Tp g (zgR- 1 ^) (3a) 

t p =1 
T s 

p(Y|R, g) oc |R|“ Ts JJ g (y^R _ 1 y ti ) (3b) 

t s =1 


where z t = x* — . Additionally, we assume that M depends on a parameter vector 6 s while 


R depends on 6 n . Our objective is then to estimate 6 = 


from (X, Y). Let us emphasize an 


essential difference of the problem in ([ 3 ]) with respect to the typical problem of target detection 
in CG clutter 28 . There, within each range resolution cell the clutter is perfectly Gaussian and 


therefore the optimum space-time processing is the same as per the standard Gaussian problem 
formulation. It is the data dependent threshold and clutter covariance matrix (in adaptive 
formulation) that needs to be calculated from the secondary data, if not known a priori |;28,29|. 
In the problem ([ 3 ]), the SOI DoA estimation should be performed on a number T p of ECD i.i.d. 
primary training samples, and maximum likelihood DoA estimation algorithm and CRB should 
be expected to be very different from the Gaussian case. 

The paper is organized in the following way. In Section [2j we derive a general expression of 
the FIM for elliptically distributed noise using two data sets. Section [3] focuses on the case of 
DoA estimation in Ji-distributed noise. In section 3.2 we derive conditions under which the FIM 
is bounded/unbounded, and provide a sufficient condition for unboundedness of the FIM with 
general elliptical distribution. The maximum likelihood estimate, as well as an approximation, 
are derived in section 3.3 In the same section, we derive lower and upper bounds on the mean- 
square error of the MLE for non-regular estimation conditions, i.e., when the Fisher information 
matrix is unbounded. Numerical simulations serve to evaluate the performance of the estimators 
in Section [4] and our conclusions are drawn in Section [5j 
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2 Cramer-Rao bounds 


In this section, we derive the CRB for estimation of parameter vector 6 from the distribution 
in Q. The Fisher information matrix (FIM) for the problem at hand can be written as [l 

,, c fSlogp(X,Y|M,R, 9 )91ogp(X,Y|M,R, 9 ) 

F(, ’* ) = £ {- se, -- 

_ 5 f aiogp(X|M,R,g)aiogp(X|M,R,g) 




89 


89b 


+ g{ d lo gP( Y l R ,sO Slogp(Y|R, g ) 


89i 


89 k 


where we used the fact that 


(4) 


5 f aiogp(X|M,R, g )aiogp(Y|R,g) ) _ Q 

l 89j 89 k 


(5) 


Hence, the total FIM is the sum of two matrices F = F x + F y , with straightforward definition 
from Q. In order to derive each matrix, we will make use of the general expression of the Fisher 
information matrix for ECD recently derived in 130,31]. First, let us introduce 

( 6 ) 


atl = £{Q^\Q t )} 


where <f>(t) = Then, we have from [30 1 that the ( j , /c)-th element of the Fisher information 

matrices is given by 


F *(^) = ^r£ R e{ 


dm? 8m t 

_p—1 l V 1 

89i 89k 


+ T t 

+ 


a 2 


p [M(M + 1 ) 
a 2 T p 


- 1 


^{R-^jjltlR 1 R fc } 


M(M + 1 ) 


Tr{ R _ 1 Rj R _ 1 R fe } 


(7) 


F y(j,k) = T s 


+ 


a 2 


- 1 


_M(M + 1 ) 

T s rn. rn— Ity tj — 1 


Tr{R _1 Rj}Tr{R _1 Rfc} 


M(M + 1 ) 


TrlR-'RjR-'Rj 


( 8 ) 


where R, = PP. Since R depends only on 6 n , it follows that F„ takes the following form 


de 


Fy = 


0 0 
0 F™j 


(9) 


with 


F ™(j,k)=T t 

+ 


OL 2 


- 1 


_M(M + 1 ) 

° 2Ts Tt{R _1 R”R _1 R^} 


TrjR-^nTrjR-^n 


M(M + 1 ) 


( 10 ) 
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where R” = Jp-. Let us now consider F x . Using the fact that R depends only on 9 n and mf p 
depends only on 9 s , F x is block-diagonal, i.e., 


F,„ = 


■pss 

X 


-pnn 


with 


2ai ^ dmj 1 dm t 


u =1 


i>K 


F TU,k) = T, 


012 


+ 


p [M(M + 1) 
OL2T p 


- 1 


Tr{R _ 1 R” } Tr { R _ 1 R” } 


-1d»1 


M(M + 1) 
The whole FIM is thus given by 

F = 


Tr{R -1 R"R -1 R£}. 


K s 

o 


t pnn _i_ "pn 
F x ' F y 


(ID 


(12a) 


(12b) 


(13) 


The CRB for estimation of 0 s is obtained as the upper-left block of the inverse of the FIM and 
is thus simply CRB(9 S ) = (F 8S ) _1 . Similarly to the Gaussian case, the CRB for estimation of 
9 s in the conditional model is the same as if R was known. As for the CRB for estimation of 
9 n , it is the same as if we had a set of T = T„ + T S noise only samples. 


3 Application to /i-distributed noise 

3.1 Data model 

We address the specific problem where the primary data can be written as 

= ™t p + ^n tp 


(14) 


where Tt p follows a Gamma distribution with shape parameter v and scale parameter /3, i.e., its 
p.d.f. is given by 


2>(%) = 


(15) 


which we denote as r t ~ Q and nt p ~ CAA(0,R). The noise component is known to 

follow a K distribution and x^ p in (14) admits a CES representation similar to 0 with Qt p = 
Q {y,f3) x C Xm- The p.d.f. of Qt p in this case is given by 


p(Qt p ) = 


2 /3~( U + M )/ 2 v+m. i 


f + M -j / /- 

Q tp 2 Km-v ( 2 \JQ tf 


r{u)T(M) 

where Km-u(-) is the modified Bessel function. Note that the fi-th. order moment of Qt p is 

2j3~(v+M)/2 


(16) 


£ 


R} 


r(i/)T(M) Jo tp 
ro ° 


30 c£^ X K M -v (2 yjQt p /P ) dQ tp 


no o 

/ z 2p+u+M ~ 1 K M -u{z)dz 

Jo 


2 2 v+»+M -2 r(u)T(M) Jo 

m) M) {m + millfo, M) > 0) 


( 17 ) 
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where we used the fact 132 s 6.656.16] that 

i 

z^K u (z)dz = 


f 


2 M-lp p fl+l±U>0 

otherwise 


(18) 


oo 


The density generator is thus here 

g(Q ) = Q^Km-u (zVW) ( 19 ) 

where, for the sake of notational convenience, we have dropped the subscript t p ■ 

3.2 Cramer-Rao bounds 

The FIM for A'-distributed noise can be obtained from the FIM for Gaussian distributed noise 
and the calculation of the scalar 




g'(Q ) 

g(Q) 


n 2 


( 20 ) 


for fj, G {1, 2}. For the signal parameters part only, we indeed have = M ^iFg where the 
subscript k and g stand for A"-distributed and Gaussian distributed noise. Using the fact that 
K'a(z) = “ K a (z ) - K a+ i(z), it follows that 

V — M „ v-M 


g\Q ) = v -^-QT^- x k M -u ( 2 y/Qjp) + Q"^r 1/2 K' M -u (2 


V — M _ v-M 


Q 2 1 Km-v ( 2 

r M-v 


2 

v-M-l 


+ q^^p~ 1/2 


= -Q 2 


2\/W 

pr y 2 K M+1 - v (2 


Km-v 2 


— Km+i-v (2 


( 21 ) 


It then ensues that 


m = = ‘r 1 *r l * Ku+l " ^ 


Km-v (2VW) 


( 22 ) 


and thus 



(23) 


A formula for the FIM in case of A'-distributed noise was derived in |33j based on the compound 
Gaussian representation ©• While it resembles our derivations based on the FIM for ECD 
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derived in 30 , it does not match exactly our expression herein. Moreover, we study herein the 
existence of the FIM and derive a closed-form, approximation of the FIM. 

Let us investigate the conditions under which the integral 




K 2 (A 


(24) 


converges. Towards this end, let us use the following inequality which holds for M + 1 — v > 1 
and z > 0 34 

k m+ !_„(-) (M + 1 - 0 + + (M +1 - O 2 


K M -v{z) 


> 


> 


M+l—v „ 
M-i/ 25 




M+l-i/ , 
M—v " 


M — v 
M + l — v 


1/2 


+ (M - z/)£ 


-1 


(25) 


It follows that 


4> 


M -v 
M + l — v 


\ 1/2 /*oo 

) Jo z 2 fl+U+M - 3 R M+i-Az)dz 


+ (M - J/) / z 2 ' ,+ ^ M — 4 A m+1 _„ ( 2 ) dz 

Jo 


(26) 


The first integral converges for 2/x + ^ + M — 2 — M— l + z/>044-/t + z/ — | >0 while the 
second converges for 2p + v + M — 3 — M — l + v > 0 p + v — 2 > 0. Hence, for p + v — 2 > 0, 
one has 


(; WTTA ) V2 + m - ijro. + * - |) 

+ (M - ^)2 2/i+l/+M - 5 r(/r + M - 1 )T{p + 1 / — 2). 
Accordingly, one has, for 2 : > 0 


4> 


(27) 


Ajw+1— viz) {M + l — v) + yjz 2 + (M + 1 — v) 2 

I<M-u(z) z 

< 2(M + 1 - z/)z _1 + 1 


(28) 


which implies that 

/•OO 

4 < 2(M + l-i/) / ^ +v+M - 4 i( M+1 _„ (*) dz 

Jo 

/•oo 

+ / ^ +M - 3 A m+ i-„(z). (29) 

Jo 

The first integral converges for p + v — 2 > 0 and the second converges for p + z/ — § > 0. In the 
former case, one has 


I fl <(M + l- v)2 2 » +u+M - 4 T{p + M- 1 )T(p + v-2 ) 

+ 2 2 ^ +M - 4 T(/i + M - J)T(p + v - |). (30) 
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Consequently, we conclude that the integral converges only for p + v — 2 > 0: for p = 2 this 
implies that v > 0 which is verified. In contrast, when /z = 1, one must have v > 1. In other 
words, the term in the FIM corresponding to the noise parameters is always bounded since it 
depends on I 2 only. The situation is different for signal parameters. In an unconditional model 
where R would depend on signal parameters as well, the FIM is bounded. In contrast, in the 
conditional model where signal parameters are embedded in the mean of the distribution, the 
FIM corresponding to signal parameters is bounded only for v > 1: otherwise, it is unbounded. 
The latter case corresponds to the so-called non regular case corresponding to distributions with 


singularities, as studied e.g., in 35 


Before pursuing our study of the FIM for the specific case of JF-distributed noise, let us make 
an important observation. For the K distribution, we have just proven that I\ does not exist 
for v < 1. However, see £ j exists if and only if p + M >0 and p + v > 0. The 

latter condition implies that, when v < 1, £ jQjp 1 } = g fo° Q AI 2 9{Q)dQ does not exist. 
Observe that convergence of the latter integral is problematic in a neighborhood of 0, since for 
Qo > 1, Jg () Q M ~ 2 g{Q)dQ < Jq o Q AI ~ l g(Q)dQ < 5 m, g as p(.) is a density. Therefore, at least 

for A'-distributed noise, if £ {Qt^, 1 } does not exist, then £ {Qt p ^ 2 (Qt p )} is unbounded. At this 
stage, one may wonder if this property extends to any other elliptical distribution. It turns out 
that this is indeed the case, as stated and proved in the next proposition. 

Proposition 1. Whatever the p.d.f. of the modular variate Qt p , if £ {Qq, 1 } = 00 then 
£ {Qt p f> 2 {Qt p )} = 00. 

Proof. For the sake of notational convenience, we temporarily omit the subscript t p and use Q 
instead of Qt . Let us first observe that 


£{Qf) 2 (Q)}= Qf> 2 (Q)p{Q)dQ= Q~ 1 fj 2 {Q)p{Q)dQ. 


(31) 


Since p{Q) = L d{Q )> one can write 


which implies that 


iKQ) = -Q = _ Q d ^9{Q) 
g(Q) dQ 


= -Q 


3 In Q 1 Iw p(Q) 
dQ 


= -(1 - M)Q d -fff - Q &,np(Q) 


= (M- 1) -Q 


dQ dQ 

d\wp(Q) 
dQ ' 


(32) 


Q-V(Q) = (M - 1 YQ- 1 - 2(M - 1) 


s d\np(Q) 
dQ 


+ Q 


d\np(Q) 
dQ 


1 2 


(33) 













Therefore 


" h Q<p 2 (Q)p(Q)dQ = (M - l) 2 [ h Q~ 1 p(Q)dQ - 2(M - 1) f d]n ^ Q>) p(Q)dQ 

i J a J a 

2 


dQ 


+ 


Q 


d\np(Q) 

dQ 


p(Q)dQ 


rb rb 

= (M - l) 2 / Q~ 1 p(Q)dQ - 2(M - 1) / p'(Q)dQ 

J a J a 


+ 


Q 


dlnp(Q) 

dQ 


p{Q)dQ 


= (M - l) 2 / Q~ l p{Q)dQ - 2(M - 1) [p(b) - p(a)} 


+ 


Q 


dlnp(Q) 

dQ 


p{Q)dQ. 


(34) 


The third term of the sum is always positive. In the second term, we have that lim b^ooP(b) = 0. 

It follows that divergence of Q~ 1 p(Q)dQ is a sufficient condition for divergence of J ^ Q(fi 2 (Q)p(Q)dQ. 

As said before lim^oo Q~ l p{Q)dQ exists, and therefore a sufficient condition for £ {QJ) 2 {Q)} 
to be undounded is that lim a _>.o J a °° Q~ 1 p(Q)dQ = £ {Q -1 } is unbounded. □ 

Let us now go back to the A'-distributed case and investigate whether it is possible to derive 
a simple expression for 1^ and subsequently assuming that p + v — 2 > 0. Towards this end, 
let us make use of 

2(M - v) 


Km+i-*(z) = 


+ Km-I-v(z) 


(35) 


to write that 


4 = 


[°° 2n+v+M-3 K M+l-v ( g ) ■ 

Jo ^ Km-u{z) 

noo 

4(M — v) 2 / z 2 ^ +v+M ~ 5 K M -u(z)dz 

Jo 


+ 4(M-i/) / z 2 ^ +u+M - A K M -i-u{z)dz 

Jo 


+ / z 

Jo 


2/i+iH-M—3 


—3 ( z ) 


dz 


Km-w (z) 

= 22m+^+m- 4 ( M _ + M- 2 )T(p + 12-2) 

+ 2 2 ^ +u+m ~ 3 {M - u)T(p + M- 2 )T{p + 12- 1) 

poo rv2 / y \ 

' 2 ^+u+M-Z , 

Km—v {z) 


+ 


(36) 


The last term is obviously not possible to obtain in closed-form so that we use a “large M — 12 ” 
approximation of the modified Bessel function 361 


Km-u(z) ~ 


7T 


ez 


2 (M- 12 ) \2(M — 12 ) 


-(M-v) 


(37) 
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which results in 


Km-I-u (z) ~ . 

f M -is N 

V2 

(AT - 1 - v) M ~ 1 “ 1 ') ez 

K M - u (z) - 1 

^ M — 1 — is j 

I 

(M - is) M -») 2 


( M-is ' 

V2 

1 ez 

— I 

\M — 1 — IS y 

I 

e(M-is) 2 


2(Af-i/) 1 / 2 (Af- 1 -i/) 1 / 2 ' 


( 38 ) 


Therefore, 

r 2fj,+v+M—3 ( z ) 

Km-u {z) 


f 


dz ~ 


1 

2(M-z/) 1 /2(M-1-j/) 1 /2 
22^+z/+M—4 


/»oo 

/ (*) d* 

Jo 


(M — z/) 1 / 2 (M — 1 — z/) 1 / 2 


T(/i + M — l)r(/z + y) 


We finally have 


C^/y, — 


P»~ 2 L 


m 2 2 ^+M- 4 r(M)T(z/) 

2 T(/i + M — 2)T(/r + z/ — 2) 


~ 2 {M-isY- 
+ 2 ^~ 2 (M -is) 


r (M)T(i/) 

T(/z + M — 2 )T(/i + z/ — 1) 


r(M)r(i/) 

+ /3^ 2 (M - ^)-V 2 (M - 1 - I/)-i/2 r (M + M- l)r( M + t /) 

If the large M — is approximation is made from the start, then one has 

~ 2(M + 1 - ^(M - z^) 1 / 2 ^- 1 

Km-v yz) 


so that 


(39) 


(40) 


(41) 


and hence 


4 ~ 2(M + 1 - is) l '\M - is) 1 / 2 / 2 2 ^ + " +m - 4 X m +i-, (z) dz 

Jo 

= 2 2 ^ +u+m ~ a (M + 1 - is) 1 / 2 (M - is) 1 / 2 r(/Lt + M - l)T(/z + 1 / - 2) 
a, =, p-*(M + 1 - 0 1/2 (M - F) i/2 r(M + M r - l)r(M + .-2) 


(42) 

(43) 


Figure [T] compares the approximations in (40) and (43), as well as a method which uses 
random number generation to approximate based on its initial definition in (20). More 

precisely, we generated a large number of random variables Q = Q (is, /3) x C; Xm and replace the 
statistical expectation of (|20l) by an average over the so-generated random variables. As can 


be observed from Figure [lj the 3 approximations provide very close values, which enable one to 
validate the closed-form expressions in (40) and (43). 
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Approximations of a p 



Figure 1: Comparison of the approximations of a^ in (40) and (43). u = 1.5 and /3 = l/i 


3.3 Maximum Likelihood estimation 

We now focus on maximum likelihood (ML) estimation of direction of arrival </>o, signal wave¬ 
forms st p and covariance matrix R in the model 

= a((j)o)s tp + ^%n tp - t p = l,...,T p 
y t 3 = ^TTs u t 3 ; t s = l,...,T s (44) 

where Tt p , Tt„ ~ G {v, f3), and n tpl n ts ~ CM (0, R). The joint distribution of (X, Y) is given by 


p(X, Y) oc |R| _< ' Tp+Ts ' ) Y[ [y^R- 1 


y t s 


v-M 

2 


t 8 = 1 


I<M-u 2\y?K-iy ts /l3 


n [< r iz h 


u-M 

2 


K M -v ( 2 A /zf R 1 z t 


(45) 


where z tp = x tp — a ((j>)st . Joint estimation of all parameters appears to be very complicated 
and hence we will proceed in two steps. At first, we assume that R is known and derive the ML 
estimates of (j) and st . Then, R is substituted for some estimate obtained from observation of 
Y only. 


3.3.1 Do A estimation with known R 

Assuming that R is known, one needs to maximize with respect to cj) and s t p 

T p 

p(X) OC n 9 ([x*p - a(0)s tp ] H R _1 [x tp - a (</>)s tp \ ) (46) 

t p =l 
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where g(.) is given by (19). Since g{.) is monotonically decreasing, see (21), it follows that p(X) 
is maximized when the argument of g(.) is minimized. However, 

[xtp - a((/))st p ] H R _1 [xtp - a(^)s tp ] 

a- ff (0)R" 1 xt ' 2 


= [a ff (^)R x a(0)] 


Stp a 11 (fiRr 1 &{</>) 


+ xfR-'x, - 


ja^^R 1 x* 


a^(</>)R 1 a((j>) 
Therefore, for any <f>, p(X) is maximized when 

a H (^)R 1 x* 


Stp a^(0)R _1 a(</>)' 


It ensues that one needs now to maximize, with respect to 4> 


f(4>) = II 9 ( X S R lx *r 


\a H (<t>)R x tp | 
a // (^>)R~ 1 a(0) 


(47) 


(48) 


(49) 


with g(z) = z 2 K M - V {2y/z//3). In order to avoid calculation of a modified Bessel function and 
thus in order to simplify estimation, we propose to make use of the “large M — u v approximation 


of the modified Bessel function given in (37) to write 

I/-M 


g(z) = z 2 Km-v(2\/z/I3) 


u — M 

~ Z 2 X 


7T / e.yJJf]3 
2(M - v) \(M-v) 


~{M-u) 


= const. 2 


u—M 


(50) 


This approximation results in an approximate maximum likelihood (AML) estimator of (p which 
consists in maximizing 

■ 2 “I v—M 


m = n 


X ? R - 


Note that 


log /#) = {v ~ M) Y log 


t v =l 


|a g (^)R 1 ^t p 

a- f/ ((/>)R _1 a((/)) 

a H (<(>)R~ 1 xt p | 2 
a i? ((/>)R~ 1 a (<p) 


xf R ' x tr , - 


(51) 


(52) 


which should be compared to the concentrated log likelihood function in the Gaussian case, as 
given by 


logic#) = - Y 


t p =l L 


x c R ' x t„ - 


|a g (0)R ^Xfpj 

a H (<p)R^ 1 a (</>) 


(53) 


A few remarks are in order about these estimates, in particular about the behavior of the 
AML estimator in the case of unbounded FIM, i.e., when 0 < v < 1 . First, note that all 
estimates will be a function of 

12 


*( x tp#)= x £ R 1 x ip - 


\a H (</>)R 1 x t . 
a H (^)R _1 a(0) 

= X J R ' 1/2P R-l/2 a( ^) R ~ 1/2x ^ 


(54) 
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where P 


R _1 / 2 a(<^) 


is the projection onto the orthogonal complement of R U 2 a(i j>). Compared 
to (53), the logarithm operation in (51) will strongly emphasize those snapshots x^ p for which 
t(x.t ,(j>) is small. Let us thus investigate the properties of this statistic, when evaluated at the 
true value of signal DOA 4> o- Using the fact that R~ 1 // 2 x ip = R~ 1 // 2 aoSi p + ~ tp , where 

w fp ~ CAA(0,Im) and ao is a short-hand notation for a.(fa), one has 


tfat p ,fa) = r i P wJP^_ 1/2ao w tp = r tp x C Xm_i- (55) 

For small v (0 < v < 1), it follows that, in the vicinity of <f> o, the snapshot with minimal t(xj p , </>) 
is more or less the snapshot for which r* is minimum, hence the snapshot for which noise power 
is minimum, which makes sense. If we let ut p = mini<t r tp , then its cumulative density 
function (c.d.f.) is given by 

Pr [ u Tp < v] = 1 - (1 - Pr [n p <v]) Tp 

= 1 - [l-7(^77/3 _1 )] Tp (56) 

which is shown in Figure [2j Obviously, with small v, the snapshot which corresponds to the 
minimum value of r* exhibits a very high signal to noise ratio and, due to the emphasizing effect 
of the log operation in ( |51[ ), the performance of the AML estimator is likely to be driven mainly 
by this particular snapshot. This is illustrated in Figure [3] where we display the mean-square 
error (MSE) of the AML estimate which uses all T p snapshots and the MSE of an hypothetical 
AML estimator which would use only the snapshot x^ min corresponding to the minimum value 
of Tt . The scenario of this simulation is described in the next section. This figure shows a 
marginal loss of the AML estimator using xt min only, as compared to the full AML estimator, 
especially for small v. 


Cumulative Density Function of minr, 



Figure 2: Cumulative density function of mini<t <y r* . M = 16 and T p = 4. 


Let us thus analyze the behavior of the AML estimators. For the sake of notational conve¬ 
nience, let 4>k and <J)™ n denote the AML estimator using T p snapshots with Ji-distributed noise 
and the AML estimator using the snapshot Xf . corresponding to the minimal r+ , respectively. 


Observe that, when using a single snapshot xt min , minimizing (51) is equivalent to minimizing 


the Gaussian likelihood function in (53) with T p = 1. Since X( min exhibits a high signal to noise 
ratio, </>™ n is close to 4> o, one can make a Taylor expansion and relate the error </>™ n — <po to the 


error x+ . — ans+ 

t min u L n 


as 


4>K n -fa- y/ufa, 


(57) 
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Figure 3: Mean square error of AML estimator using either all snapshots or a single snapshot 
corresponding to minimal r* . R known, M = 16 and SNR = 3dB. 


where C is some vector that depends essentially on the derivatives of a(0) (37j and whose 
expression is not needed here. One can simply notice that £ would be the same with Gaussian 


noise and a single snapshot, since maximizing (51) or (53) is equivalent when one snapshot is 
used . This implies that 

£ {far - 0o) 2 } ^ £ M C H *C (58) 

Observe that C^RC is the mean-square error (MSE) that would obtained in Gaussian noise and 
a single snapshot, which is about T p times the MSE obtained in the Gaussian case and using 
T p snapshots, and the latter is approximately the Gaussian CRB. The MSE of depends on 
£ {ut p } where ut p is the minimum value of a set of T p independent and identically distributed 
(actually gamma distributed) variables. Therefore, in order to obtain £ {ut p }, one must consider 
statistics of extreme values, a field that has received considerable attention for a long time, see 
e.g., J38-40]. It turns out that only asymptotic (as T p —> oo) results are available and we build 
upon them to derive the rate of convergence of £ | (</>™ n — ^>o) 2 |- First, note that 


Pr 


T p /u u Tp > x 


= Pr 


UT P > T-^x 


(Pr [r tp >T-^x]) Tp 
<T~ 1/V x\) 
l-'y{u,r 1 T- 1 /v x) 


= 1 — Pr 


(59) 


Now since u is small and T p is large, T p 1 ^ I/ is very small and we can approximate q(a, y) — 
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ar(a)] 1 y a , which yields 


Pr 


T^u Tp > x 


1-7 (y,P l Tp 1,v x) 


1 - 


exp 


p- lJ T- x x v 


— Vrp—l v \ P 


= exp 


p- u T~ l x 
vT(v 

P~ v x u 

' vP{y) 


( 60 ) 


It follows that asymptotically, vt p = T^ v ut p converges to the distribution in (60), whose prob¬ 
ability density function is 


r v 


= exp ~ 


P~ v v v T 
_ _ 

vY(y) 


(61) 


Using integration by parts, it follows that 


roc 0 — 1 / 


£ { V T P }= I FTUP^exP^ “ 


-x exp 
= RiM v ~ 


T(u) 

p~ v x h 

'^F) 


p- v x v 

^FF 


dx 


OO /‘OO 


+ / ex P s — 


Jo JO 


(3-"x u 


dx 


/*00 

p v 1 l v ~ 1 T(v) l l v / z 1 l v -'exp{-z}dz 

Jo 

= pv 1 / v - 1 r(v) 1 / v r(v- 1 ) = c(v,p). 


(62) 


One can then conclude that, as T p goes to infinity, 

£ {(C in - 0o) 2 } - [C^R-C] C{v,P)T~ l /\ (63) 

Therefore, in the case of 0 < v < 1, the MSE of </>™ n decreases as T p 1 ^, a rate of convergence 
much faster than the usual Tp 1 . Note that this case corresponds to unbounded FIM. Such rates 
of convergence are also found with distributions possessing singularities 135], chapter 6]. 

As for the AML estimate obtained from T p snapshots, namely , its MSE is upper-bounded 
by that (j)™ n (since it uses all snapshots, including x tmin ), and is lower-bounded by the MSE 
that would be obtained if Tt p = ut p for t p = 1, ■ ■ • , T p . and this MSE is T~ x times the MSE 
of Additionally, as said before, we have C^RC — T p CRB q (<f>) where CRBq‘( fo) is the 

T 

Gaussian CRB using T p snapshots. Hence, one can bound the MSE of <f>^ as 

CRBq p {( f>o)C{u,P)T~ 1/v < f | (4 P - 4) 2 } < CRB%(cl )0 )C(v,P)T- 1 /‘'+ 1 . (64) 

As will be illustrated in the next section, the upper bound is rather tight, while the lower bound 
is much lower than the actual MSE. 
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3.3.2 Estimation of R using secondary data 


When R is not known, then the secondary data Y can be used to estimate it. The maximum 
likelihood estimator is obtained (for T > M ) as the solution (up to a scaling factor) to the 
following implicit equation [27 


1 

Rml = y E $ (y£ R MLy u) yt s y? s 


t a = 1 


/3 1 / 2 T s 


£ (y^R-m.y t s ) 


- 1/2 


Km+i-u lyts)/P 


ts = 1 


Km-u (2 v /(y^RM 1 L y ts )/^ 


yt a y t 


H 


(65) 


Rml can be obtained through an iterative procedure, whose convergence is guaranteed under 
the assumptions made 27 . In order to avoid evaluation of the modified Bessel function, one 


can use the large M — v approximation of Km+ i-v (z) /Km-u (z) in (41) to define Raml as the 
solution to the fixed-point solution 


Raml = 


(M + 1 — i/) 1 / 2 (M — u) 1 / 2 


yt s yts 


5 y" R AM L yi, 


( 66 ) 


Note that Raml is more or less the well-known Tyler fixed-point estimator |41 , which again can 
be obtained from an iterative procedure whose convergence is guaranteed 129,42 . The drawbacks 
of the two above estimators are that l)they are suited to a K distribution for the noise and 2 )T S 
is required to be larger than M. In order to gain robustness against these problems, a solution 
is to use normalized data z ts = y t J ||y ts || whose distribution is independent of that of the noise, 


and to use regularization. More precisely, we suggest to resort to the following scheme |43-45 


M 


Rfe+ i(v) = (! ~v)y 




s t.= i z¥ 


liv) = 


M 


Tr{R fc+ i(r 7 )} 


R -k(v) 

Rfc+iW- 


i -i 


+ rjl M 


z t s 


(67a) 

(67b) 


and define Rracg( 7 ?) = hnifc^oo R f.(rj) since convergence of this iterative scheme has been proved 
[461. The very good performance of this scheme has been illustrated in various applications, see 
e.g., 144-48), where discussions on how to select the regularization parameter r/ can also be found. 


4 Numerical simulations 


We assume a linear array of M = 16 elements spaced a half-wavelength apart and we consider 
the simple scenario of a single source impinging from (f> o = 10° embedded in unit power K- 
distributed noise. The covariance matrix R is given by R (k, £) = with p = 0.99. The 

exact and approximate maximum likelihood estimators, which consists in maximizing f(4>) in 


(49) f{4>) in (51) were implemented using the Matlab function fminbnd, and the maximum 
was searched in the interval [</>o — 2(j)^dB,4’o + 2 </> 3 <ib] where 03 dB is the half-power beamwidth 
of the array. The signal waveform was generated from i.i.d. Gaussian variables with power 
P and the signal to noise ratio (SNR) is defined as SNR = P(a^R~ 1 ao). The asymptotic 
Gaussian CRB, multiplied by the scalar a\/M was used as the bound for Redistributed noise. 


For the regularized covariance matrix estimator Rracg( 7 ?) of (67), the value of r/ was set to 
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r] = 0.01. 1000 Monte-Carlo simulations were used to evaluate the mean-square error (MSE) of 
the estimates. 

In Figures [4] and [ 5 ] we plot the CRB (for v > 1) or the lower and upper bounds of (25) when 
v < 1, as well as the MSE of the ML and AML estimators, as a function of T p . and compare 
the case where R is known to the case where it is estimated from (67) with T s = 32 snapshots 
in the secondary data. The following observations can be made: 


• there is almost no difference between the MLE and the AMLE, and therefore the latter 
should be favored since it does not require evaluating modified Bessel functions. 


• the MSE in the case where R is known is lower than that when R is to be estimated, which 
is expected. However, the difference is smaller when v < 1: in other words, it seems that 
adaptive whitening is not so much penalizing with small v while it seems more crucial for 
v > 1. Indeed, for small v, what matters most is the fact that some snapshots are nearly 
noiseless, and this is more influential than obtaining a very good whitening. 


• the decrease of the MSE for u > 1 is roughly of the order T p 1 . When v < 1, this rate is 

significantly increased and the MSE decreases very quickly as T p 1 ^ i/ , as predicted by the 
analysis above. This rate of convergence is also observed in Figure [6] where we consider a 
scenario with two sources at <f> = 10°, 12°. 


the upper bound in (25) seems to provide quite a good approximation of the actual MSE, 
at least for T p large enough. 


The influence of T s is investigated in Figure [7j where one can observe that about T s = 64 is 
necessary for the performance with estimated R to be very close to the performance for known 
R. However, as indicated above, this is less pronounced when 1 / < 1, where the difference 
becomes smaller with lower T s . 

Finally, we investigate whether the rate of convergence of the MLE or AMLE when v varies 
is impacted by a small amount of Gaussian noise. More precisely, we run simulations where the 
data is generated as 


x ip = a(^ 0 )s tp + yf{l - «) v / R(( Rl/2w ip + ( 68 ) 


where w~ CAA(0 ,Im-), i.e., the noise is a mixture of Redistributed noise and Gaussian 
distributed noise. The covariance matrix of the noise is now Ra'+g = (1 — a)R + ad and we use 
the AML estimator assuming that the noise has a K distribution with parameter v and known 
covariance matrix R/^+g- I n Figure |8j we display the MSE of the AML estimator versus T p and 
versus v for different values of a. Clearly, the rate of convergence of the estimator is affected 
by a small amount of Gaussian noise, even when v is small. This indicates that, if noise is not 
purely A'-distributed with small v, we recover the usual behavior of the MSE versus T p . 


5 Conclusions 

In this paper we addressed the DoA estimation problem in Redistributed noise using two data 
sets. The main result of the paper was to show that, when the shape parameter v of the texture 
Gamma distribution is below 1, the FIM is unbounded. On the other hand, for v > 1, the FIM 
is bounded and we derived an accurate closed-form approximation of the CRB. The maximum 
likelihood estimator was derived as well as an approximation, which induces non significant 
losses compared to the exact MLE. In the non regular case where v < 1, we derived lower and 
upper bounds on the mean-square error of the (A)ML estimates and we showed that the rate of 
convergence of these (A)ML estimates is about T p 1 ^ I/ where T p is the number of snapshots. 
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Figure 4: Cramer-Rao bounds and mean square error of estimators versus T p with either R 
known or estimated. M = 16, SNR = 3dB, T s = 32 and v > 1. 
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Figure 5: Mean square error of estimators versus T p with either R known or estimated. M = 16, 
SNR = 3dB, T s = 32 and u < 1. 
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Figure 6: Mean square error of AML estimator in a two sources scenario with cj) = 10°, 12°. R 
known, M = 16 and SNR = 3dB. 
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Figure 7: Cramer-Rao bounds and mean square error of estimators versus T s . M = 16, SNR 
3dB, T p = 16 and varying v. 
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Figure 8: Mean square error of AMLE versus T p in the case of a mixture of A'-distributed and 
Gaussian distributed noise. M = 16, SNR = 3dB, and varying a. 
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